Multiple Retrieval Models and Regression Models for Prior Art Search

نویسندگان

  • Patrice Lopez
  • Laurent Romary
چکیده

This paper presents the system called PATATRAS (PATent and Article Tracking, Retrieval and AnalysiS) realized at the Humboldt University for the IP track of CLEF 2009. Our approach presents three main characteristics: 1. The usage of multiple retrieval models (KL, Okapi) and term index definitions (lemma, phrase, concept) for the three languages considered in the present track (English, French, German) producing ten different sets of ranked results. 2. The merging of the different results based on multiple regression models using an additional validation set created from the patent collection. 3. The exploitation of patent metadata and of the citation structures for creating restricted initial working sets of patents and for producing a final re-ranking regression model. As we exploit specific metadata of the patent documents and the citation relations only at the creation of initial working sets and during the final post ranking step, our architecture remains generic and easy to extend.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Experiments with Citation Mining and Key-Term Extraction for Prior Art Search

This technical note presents the system built for the IP track of CLEF 2010 based on PATATRAS (PATent and Article Tracking, Retrieval and AnalysiS), the modular search infrastructure initially realized for CLEF IP 2009. We largely reused the system of the previous CLEF IP but at a relatively smaller scale and with the improvement of three main components: • A new citation mining tool based on C...

متن کامل

توانایی مدل مبتنی بر تحلیل لوجیت در پیش‌بینی درماندگی مالی و تأثیر متغیر کارایی در بهبود مدل

Nowadays operational research has become a part of financial re-search. One way that operational research can be used in financial re-search is designing financial distress prediction models. In this re-search we designed a financial distress prediction model based on lo-gistic regression and on financial ratios and efficiency scores. For this purpose, we designed and tested two models: 1- fina...

متن کامل

Improved Skips for Faster Postings List Intersection

Information retrieval can be achieved through computerized processes by generating a list of relevant responses to a query. The document processor, matching function and query analyzer are the main components of an information retrieval system. Document retrieval system is fundamentally based on: Boolean, vector-space, probabilistic, and language models. In this paper, a new methodology for mat...

متن کامل

Improved Skips for Faster Postings List Intersection

Information retrieval can be achieved through computerized processes by generating a list of relevant responses to a query. The document processor, matching function and query analyzer are the main components of an information retrieval system. Document retrieval system is fundamentally based on: Boolean, vector-space, probabilistic, and language models. In this paper, a new methodology for mat...

متن کامل

EVALUATION OF CONCRETE COMPRESSIVE STRENGTH USING ARTIFICIAL NEURAL NETWORK AND MULTIPLE LINEAR REGRESSION MODELS

In the present study, two different data-driven models, artificial neural network (ANN) and multiple linear regression (MLR) models, have been developed to predict the 28 days compressive strength of concrete. Seven different parameters namely 3/4 mm sand, 3/8 mm sand, cement content, gravel, maximums size of aggregate, fineness modulus, and water-cement ratio were considered as input variables...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/0908.4413  شماره 

صفحات  -

تاریخ انتشار 2009